

VLCE: A Knowledge-Enhanced Framework for Image Description in Disaster Assessment

Rahman, Md. Mahfuzur, Gupta, Kishor Datta, Kamal, Marufa, Rahman, Fahad, Siddique, Sunzida, Hasan, Ahmed Rafi, Haque, Mohd Ariful, George, Roy

arXiv.org Artificial Intelligence

The processes of classification and segmentation utilizing artificial intelligence play a vital role in the automation of disaster assessments. However, contemporary vision-language models (VLMs) produce details that are inadequately aligned with the objectives of disaster assessment, primarily due to their deficiency in domain knowledge and the absence of a more refined descriptive process. This research presents the Vision Language Caption Enhancer (VLCE), a dedicated multimodal framework that integrates external semantic knowledge from ConceptNet and WordNet to improve the captioning process. The objective is to produce disaster-specific descriptions that effectively convert raw visual data into actionable intelligence. VLCE utilizes two separate architectures: a CNN-LSTM model with a ResNet50 backbone, pretrained on EuroSat for satellite imagery (xBD dataset), and a Vision Transformer developed for UAV imagery (RescueNet dataset). Across architectures and datasets, VLCE exhibits a consistent advantage over baseline models such as LLaVA and QwenVL. Our optimal configuration reaches 95.33% on InfoMetIC for UAV imagery while also demonstrating strong performance on satellite imagery. The proposed framework marks a significant transition from basic visual classification to the generation of comprehensive situational intelligence, demonstrating immediate applicability in real-time disaster assessment systems.
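
Below is a minimal, illustrative sketch of the CNN-LSTM captioning branch described above: a ResNet50 encoder feeding an LSTM decoder. The layer sizes, weight choice, and decoding interface are assumptions for illustration, and the ConceptNet/WordNet knowledge-integration step is not reproduced here.

```python
# Sketch of a CNN-LSTM captioner with a ResNet50 backbone (hypothetical sizes).
import torch
import torch.nn as nn
import torchvision.models as models

class CNNLSTMCaptioner(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        resnet = models.resnet50(weights="IMAGENET1K_V2")  # the paper pretrains on EuroSat instead
        self.encoder = nn.Sequential(*list(resnet.children())[:-1])  # drop the fc head
        self.img_proj = nn.Linear(resnet.fc.in_features, embed_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        feats = self.encoder(images).flatten(1)      # (B, 2048) pooled visual features
        feats = self.img_proj(feats).unsqueeze(1)    # (B, 1, E) as a leading "image token"
        tokens = self.embed(captions)                # (B, T, E) caption embeddings
        hidden, _ = self.lstm(torch.cat([feats, tokens], dim=1))
        return self.out(hidden)                      # (B, T+1, vocab) next-token logits
```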


Utilizing a Novel Deep Learning Method for Scene Categorization in Remote Sensing Data

Omran, Ghufran A., Hayale, Wassan Saad Abduljabbar, AlRababah, Ahmad AbdulQadir, Al-Barazanchi, Israa Ibraheem, Sekhar, Ravi, Shah, Pritesh, Parihar, Sushma, Penubadi, Harshavardhan Reddy

arXiv.org Artificial Intelligence

Scene categorization (SC) in remotely acquired images is an important subject with broad consequences in different fields, including disaster management, environmental monitoring, urban planning, and more. Despite its many applications, achieving a high degree of accuracy in SC from remote sensing data has proven difficult, because conventional deep learning models require large, highly varied databases and must contend with high levels of noise to capture important visual features. To address these problems, this study introduces an innovative technique referred to as the Cuttlefish Optimized Bidirectional Recurrent Neural Network (CO-BRNN) for scene categorization in remote sensing data. The investigation compares the performance of CO-BRNN with current techniques, including Multilayer Perceptron-Convolutional Neural Network (MLP-CNN), Convolutional Neural Network-Long Short Term Memory (CNN-LSTM), Long Short Term Memory-Conditional Random Field (LSTM-CRF), Graph-Based (GB), Multilabel Image Retrieval Model (MIRM-CF), and Convolutional Neural Networks with Data Augmentation (CNN-DA). The results demonstrate that CO-BRNN attained the maximum accuracy of 97%, followed by LSTM-CRF with 90%, MLP-CNN with 85%, and CNN-LSTM with 80%. The study highlights the significance of physical verification to ensure the reliability of satellite data.
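
As a rough illustration of the bidirectional recurrent backbone behind CO-BRNN, the sketch below treats each row of a remote-sensing image as one step in a sequence and classifies the scene from the final hidden states. The cuttlefish optimizer (a metaheuristic the paper uses to tune the network) is not reproduced, and all dimensions are assumptions.

```python
# Sketch: a bidirectional GRU reads image rows as a sequence for scene classification.
import torch
import torch.nn as nn

class BRNNSceneClassifier(nn.Module):
    def __init__(self, width=64, channels=3, hidden_dim=128, num_classes=10):
        super().__init__()
        self.brnn = nn.GRU(channels * width, hidden_dim,
                           batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, images):                       # images: (B, C, H, W)
        b, c, h, w = images.shape
        rows = images.permute(0, 2, 1, 3).reshape(b, h, c * w)  # one row per step
        _, last = self.brnn(rows)                    # last: (2, B, hidden) final states
        return self.head(torch.cat([last[0], last[1]], dim=-1))
```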


Leveraging Ensemble-Based Semi-Supervised Learning for Illicit Account Detection in Ethereum DeFi Transactions

Fazliani, Shabnam, Sorond, Mohammad Mowlavi, Masoudifard, Arsalan

arXiv.org Artificial Intelligence

The advent of smart contracts has enabled the rapid rise of Decentralized Finance (DeFi) on the Ethereum blockchain, offering substantial rewards in financial innovation and inclusivity. However, this growth has also introduced significant security risks, including the proliferation of illicit accounts involved in fraudulent activities. Traditional detection methods are limited by the scarcity of labeled data and the evolving tactics of malicious actors. In this paper, we propose a novel Self-Learning Ensemble-based Illicit account Detection (SLEID) framework to address these challenges. SLEID employs an Isolation Forest for initial outlier detection and a self-training mechanism to iteratively generate pseudo-labels for unlabeled accounts, thereby enhancing detection accuracy. Extensive experiments demonstrate that SLEID significantly outperforms traditional supervised approaches and recent semi-supervised models, achieving superior precision, recall, and F1-scores, particularly in detecting illicit accounts. Compared to state-of-the-art methods, our approach achieves better detection performance while reducing reliance on labeled data. The results affirm SLEID's efficacy as a robust solution for safeguarding the DeFi ecosystem and mitigating risks posed by malicious accounts.
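
A compact sketch of the SLEID idea follows: an Isolation Forest flags initial outliers to seed the labeled pool, and a supervised ensemble is then self-trained on its own high-confidence pseudo-labels. The base classifier, number of rounds, and 0.95 confidence cutoff are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of SLEID-style self-training seeded by Isolation Forest outliers.
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier

def sleid_sketch(X_labeled, y_labeled, X_unlabeled, rounds=3, conf=0.95):
    # Step 1: unsupervised outlier detection seeds extra "illicit" examples.
    iso = IsolationForest(random_state=0).fit(X_unlabeled)
    outlier = iso.predict(X_unlabeled) == -1         # -1 marks outliers
    X = np.vstack([X_labeled, X_unlabeled[outlier]])
    y = np.concatenate([y_labeled, np.ones(outlier.sum())])
    pool = X_unlabeled[~outlier]
    # Step 2: iteratively absorb high-confidence pseudo-labels from the pool.
    clf = RandomForestClassifier(random_state=0).fit(X, y)
    for _ in range(rounds):
        if len(pool) == 0:
            break
        proba = clf.predict_proba(pool)
        confident = proba.max(axis=1) >= conf
        if not confident.any():
            break
        X = np.vstack([X, pool[confident]])
        y = np.concatenate([y, clf.classes_[proba[confident].argmax(axis=1)]])
        pool = pool[~confident]
        clf = RandomForestClassifier(random_state=0).fit(X, y)
    return clf
```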


Advancing Content Moderation: Evaluating Large Language Models for Detecting Sensitive Content Across Text, Images, and Videos

AlDahoul, Nouar, Tan, Myles Joshua Toledo, Kasireddy, Harishwar Reddy, Zaki, Yasir

arXiv.org Artificial Intelligence

The widespread dissemination of hate speech, harassment, harmful and sexual content, and violence across websites and media platforms presents substantial challenges and provokes widespread concern among different sectors of society. Governments, educators, and parents are often at odds with media platforms about how to regulate, control, and limit the spread of such content. Technologies for detecting and censoring media content are a key solution to addressing these challenges. Techniques from natural language processing and computer vision have been used widely to automatically identify and filter out sensitive content such as offensive language, violence, nudity, and addiction across text, images, and videos, enabling platforms to enforce content policies at scale. However, existing methods still have limitations in achieving high detection accuracy with fewer false positives and false negatives. Therefore, more sophisticated algorithms for understanding the context of both text and images may open room for improvement in content censorship and enable more efficient censorship systems. In this paper, we evaluate existing LLM-based content moderation solutions, such as the OpenAI moderation model and Llama-Guard3, and study their capabilities to detect sensitive content. Additionally, we explore recent LLMs such as GPT, Gemini, and Llama in identifying inappropriate content across media outlets. Various textual and visual datasets, including X tweets, Amazon reviews, news articles, human photos, cartoons, sketches, and violence videos, have been utilized for evaluation and comparison. The results demonstrate that LLMs outperform traditional techniques by achieving higher accuracy and lower false positive and false negative rates. This highlights the potential to integrate LLMs into websites, social media platforms, and video-sharing services for regulatory and content moderation purposes.
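
For context, the snippet below shows the kind of LLM-based moderation check the paper evaluates, using the OpenAI moderation endpoint (the model name may differ by release, and an OPENAI_API_KEY must be set); it is a sketch, not the paper's evaluation harness.

```python
# Sketch: querying the OpenAI moderation endpoint for a sensitivity verdict.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def is_sensitive(text: str) -> bool:
    resp = client.moderations.create(model="omni-moderation-latest", input=text)
    return resp.results[0].flagged  # True if any policy category fired
```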


TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks

Rasheed, Areeg Fahad, Zarkoosh, M., Abbas, Safa F., Al-Azzawi, Sana Sabah

arXiv.org Artificial Intelligence

This paper addresses the challenge of classifying and assigning programming tasks to experts, a process that typically requires significant effort, time, and cost. To tackle this issue, a novel dataset containing a total of 4,112 programming tasks was created by extracting tasks from various websites. Web scraping techniques were employed to collect this dataset of programming problems systematically. Specific HTML tags were tracked to extract key elements of each task, including the title, problem description, input-output specification, examples, problem class, and complexity score. Examples from the dataset are provided in the appendix to illustrate the variety and complexity of tasks included. The dataset's effectiveness has been evaluated and benchmarked using two approaches: the first involved fine-tuning the FLAN-T5 small model on the dataset, while the second used in-context learning (ICL) with GPT-4o mini. Performance was assessed using standard metrics: accuracy, recall, precision, and F1-score. The results indicated that in-context learning with GPT-4o mini outperformed the FLAN-T5 model.
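
The scraping step might look like the hedged sketch below, which tracks specific HTML tags to pull out each task's fields. The URL structure and CSS selectors are hypothetical; the actual sites scraped by the authors will use different markup.

```python
# Hypothetical scraper: the URL and CSS selectors are illustrative only.
import requests
from bs4 import BeautifulSoup

def scrape_task(url: str) -> dict:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return {
        "title": soup.select_one("h1.problem-title").get_text(strip=True),
        "description": soup.select_one("div.problem-statement").get_text(strip=True),
        "examples": [ex.get_text(strip=True) for ex in soup.select("pre.sample")],
        "complexity": soup.select_one("span.difficulty").get_text(strip=True),
    }
```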


A Decoupling and Aggregating Framework for Joint Extraction of Entities and Relations

Wang, Yao, Liu, Xin, Kong, Weikun, Yu, Hai-Tao, Racharak, Teeradaj, Kim, Kyoung-Sook, Nguyen, Minh Le

arXiv.org Artificial Intelligence

Named Entity Recognition and Relation Extraction are two crucial and challenging subtasks in the field of Information Extraction. Despite the successes achieved by the traditional approaches, fundamental research questions remain open. First, most recent studies use parameter sharing for a single subtask or shared features for both subtasks, ignoring their semantic differences. Second, information interaction mainly focuses on the two subtasks, leaving the fine-grained information interaction among the subtask-specific features of encoding subjects, relations, and objects unexplored. Motivated by the aforementioned limitations, we propose a novel model to jointly extract entities and relations. The main novelties are as follows: (1) We propose to decouple the feature encoding process into three parts, namely encoding subjects, encoding objects, and encoding relations. Thanks to this, we are able to use fine-grained subtask-specific features. The experimental results demonstrate that our model outperforms several previous state-of-the-art models. Extensive additional experiments further confirm the effectiveness of our model.

Introduction: Named Entity Recognition (NER) and Relation Extraction (RE), as two essential subtasks in information extraction, aim to extract entities and relations from semi-structured and unstructured texts. They are used in many downstream applications across domains, such as knowledge graph construction [38, 39], question answering [36, 37], and knowledge graph-based recommendation systems [40, 41]. Most traditional models, as well as some methods used in specialized areas [9, 35, 43, 46], construct separate models for NER and RE to extract entities and relations in a pipelined manner. This type of method suffers from error propagation and unilateral information interaction.
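
The decoupling idea can be pictured with the schematic below: three separate projections produce subject-, object-, and relation-specific features from a shared sentence encoding, instead of one shared feature for both subtasks. The dimensions and the omitted aggregation step are simplifications, not the authors' exact architecture.

```python
# Schematic decoupling: separate subject/object/relation projections over
# shared token states (e.g. from BERT); sizes are assumptions.
import torch
import torch.nn as nn

class DecoupledEncoder(nn.Module):
    def __init__(self, hidden=768):
        super().__init__()
        self.subj = nn.Linear(hidden, hidden)  # subject-specific features
        self.obj = nn.Linear(hidden, hidden)   # object-specific features
        self.rel = nn.Linear(hidden, hidden)   # relation-specific features

    def forward(self, token_states):           # token_states: (B, T, H)
        return (torch.relu(self.subj(token_states)),
                torch.relu(self.obj(token_states)),
                torch.relu(self.rel(token_states)))
```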


The Effects of Political Martyrdom on Election Results: The Assassination of Abe

Takagi, Miu Nicole

arXiv.org Artificial Intelligence

In developed nations, assassinations are rare, and thus the impact of such acts on the electoral and political landscape is understudied. In this paper, we focus on Twitter data to examine the effects of the assassination of Japan's former Prime Minister Abe on the Japanese House of Councillors elections in 2022. We utilize sentiment analysis and emotion detection together with topic modeling on over 2 million tweets and compare them against tweets during previous election cycles. Our findings indicate that Twitter sentiment was negatively impacted by the event in the short term and that the social media attention span has shortened. We also discuss how "necropolitics" affected the outcome of the elections in favor of the deceased's party, meaning that there seems to have been an effect of Abe's death on the election outcome, though the findings warrant further investigation for conclusive results.

Keywords: Japanese House of Councillors elections; Abe assassination; sentiment analysis ...
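
A rough sketch of the tweet-analysis pipeline the paper describes (off-the-shelf sentiment scoring followed by topic modeling) is shown below; the specific models, the placeholder corpus, and the topic count are assumptions, as the abstract does not name the exact tools used.

```python
# Sketch: sentiment scoring plus LDA topic modeling over a tweet corpus.
from transformers import pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = ["placeholder tweet one", "placeholder tweet two"]  # stands in for ~2M tweets

sentiment = pipeline("sentiment-analysis")   # default English model; a Japanese
scores = sentiment(tweets)                   # model would be needed in practice

counts = CountVectorizer(max_features=5000).fit_transform(tweets)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
```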


Semi-WTC: A Practical Semi-supervised Framework for Attack Categorization through Weight-Task Consistency

Li, Zihan, Chen, Wentao, Wei, Zhiqing, Luo, Xingqi, Su, Bing

arXiv.org Artificial Intelligence

Supervised learning has been widely used for attack categorization, requiring high-quality data and labels. However, the data is often imbalanced, and it is difficult to obtain sufficient annotations. Moreover, supervised models are subject to real-world deployment issues, such as defending against unseen artificial attacks. To tackle these challenges, we propose a semi-supervised fine-grained attack categorization framework consisting of an encoder and a two-branch structure; this framework can be generalized to different supervised models. A multilayer perceptron with residual connections is used as the encoder to extract features and reduce complexity. The Recurrent Prototype Module (RPM) is proposed to train the encoder effectively in a semi-supervised manner. To alleviate the data imbalance problem, we introduce Weight-Task Consistency (WTC) into the iterative process of RPM by assigning larger weights to classes with fewer samples in the loss function. In addition, to cope with new attacks in real-world deployment, we propose an Active Adaption Resampling (AAR) method, which can better discover the distribution of unseen sample data and adapt the parameters of the encoder. Experimental results show that our model outperforms state-of-the-art semi-supervised attack detection methods with a 3% improvement in classification accuracy and a 90% reduction in training time.
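
The class-reweighting intuition behind WTC (larger loss weights for classes with fewer samples) can be sketched as below; the inverse-frequency scheme is an illustrative assumption rather than the paper's exact weighting.

```python
# Sketch: inverse-frequency class weights give rare classes larger loss weights.
import torch
import torch.nn as nn

def class_weighted_loss(labels: torch.Tensor, num_classes: int) -> nn.CrossEntropyLoss:
    counts = torch.bincount(labels, minlength=num_classes).float().clamp(min=1)
    weights = counts.sum() / (num_classes * counts)  # fewer samples -> larger weight
    return nn.CrossEntropyLoss(weight=weights)

# Usage: criterion = class_weighted_loss(train_labels, num_classes=15)
#        loss = criterion(logits, batch_labels)
```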


NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task

Abdul-Mageed, Muhammad, Zhang, Chiyu, Bouamor, Houda, Habash, Nizar

arXiv.org Artificial Intelligence

We present the results and findings of the First Nuanced Arabic Dialect Identification Shared Task (NADI). This Shared Task includes two subtasks: country-level dialect identification (Subtask 1) and province-level sub-dialect identification (Subtask 2). The data for the shared task covers a total of 100 provinces from 21 Arab countries and are collected from the Twitter domain. As such, NADI is the first shared task to target naturally-occurring fine-grained dialectal text at the sub-country level. A total of 61 teams from 25 countries registered to participate in the tasks, thus reflecting the interest of the community in this area. We received 47 submissions for Subtask 1 from 18 teams and 9 submissions for Subtask 2 from 9 teams.


How Much Can New AI Tell Us About Ancient Times?

#artificialintelligence

Many researchers hope that AI will lead to a "golden age" of discovery for lost languages, hard-to-decipher writings, and badly damaged Biblical scrolls. Algorithms can chug through vast numbers of possible interpretations, presenting the scholar with probabilities to choose from. But even powerful algorithms have their work cut out for them. For example, of the hundreds of thousands of clay (cuneiform) tablets that survive from an ancient part of the Near East called Mesopotamia, many are damaged. We may know the language, but we don't know what's missing from the text and what difference the missing part makes to what is being said.